ABSTRACT


Three alternative procedures (Delta, Jackknife, Bootstrap)
were investigated and compared with respect to their
confidence interval estimation of the survival probability of
a system. Numerical results from simulations are presented in
this report.

I. INTRODUCTION


A. OVERVIEW 

A common problem in various areas of operations research
and applied statistics, e.g. in reliability and maintainability
studies, is that of predicting from available data the
probability that a future observation exceeds a given value.
An example arising in nuclear plant reliability is that a
crucial repair or down time exceeds h (= 4) hours. Another
problem is to predict the "100-year flood", or earthquake,
etc. The latter problem is difficult because there will
usually be far less than 100 years' worth of data to work
with. Still another problem is that of predicting the
probability of survival for h = 5 years for a cancer victim
receiving a particular treatment.

The simplest formulation is to assume that the data are a
random sample from a probability distribution F(x)
(continuous, i.e. having a density), with density f(x). That
is, the observed values $x_1, x_2, \ldots, x_n$ are
realizations of independent identically distributed random
variables generically denoted by X. If one is willing to
assume also that the mathematical form of $F(x) = F(x;\theta)$
is known (e.g. is Log-normal, or Gamma, or another candidate)
then what one can do is:

(a) Estimate the possibly multidimensional parameter $\theta$;
e.g. $\theta$ would be $(\mu, \sigma^2)$, the population mean
and variance of $\ln X$ for a log-normal model, estimated by
$\overline{\ln x} = \frac{1}{n}\sum_{i=1}^{n}\ln x_i$ and
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(\ln x_i - \overline{\ln x})^2$
(or with divisor n classically).

(b) Quote the point estimate $F(x;\hat\theta)$ or, in the
present case, $1 - F(h;\hat\theta)$ for the probability of
survival beyond h.

(c) Utilize facts about the sampling distribution of
$\hat\theta$ to find a standard error or confidence limits on
$1 - F(h;\theta)$, the survival probability.
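
Steps (a) and (b) can be sketched in a few lines of Python. This is only an illustration under the log-normal assumption: the function name and the sample values are hypothetical and are not taken from the thesis programs.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_survival_estimate(times, h):
    """Plug-in estimate of P(X > h), assuming ln X is normal."""
    logs = [math.log(x) for x in times]
    mu_hat = mean(logs)          # step (a): estimate of mu
    sigma_hat = stdev(logs)      # step (a): uses the n-1 divisor
    q_hat = (math.log(h) - mu_hat) / sigma_hat
    # step (b): 1 - F(h; theta_hat) is the upper normal tail beyond q_hat
    return 1.0 - NormalDist().cdf(q_hat)


# Hypothetical down times in hours (illustrative values only).
sample = [0.5, 1.2, 2.0, 3.5, 6.0, 8.5]
print(round(lognormal_survival_estimate(sample, 4.0), 3))
```

Step (c), attaching confidence limits to this estimate, is the subject of the remainder of the chapter.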

The basic assumption, then, is that the data can reasonably
be assumed to be a random sample from a fixed distribution,
the form of which is known. There are various ways in which
such convenient assumptions can be violated, one obvious one
being that the fixed distribution idea is not justifiable
(perhaps because of important detectable variation in the
distribution from location to location, or plant to plant,
from repair crew to repair crew, etc.). Another might be
that some data points are missing: too-short ones (down
times) are not written down or else are recorded incorrectly,
and too-long ones are regarded as being so exceptional as
never to recur, and hence are removed. Possible or likely
departures from the basic assumption should be investigated.
The raw data should be carefully examined in an exploratory
spirit (see J. W. Tukey [Ref. 1]), e.g. by graphics to check
for departures from the basic "stationary" assumption. In
this discussion we rule out such variations.

This paper gives an account and some evaluation of
several different ways of accomplishing step (c) above
(confidence limits for the probability of survival or
exceedance of time h and related topics). It will discuss four
different methods for attacking the estimation and confidence
limits problem.

1. Mathematical Formulation

We shall assume that $(x_1, x_2, \ldots, x_n)$ are the complete
times of repair (or down times), and that they are independent
realizations of the generic random variable X, where
$Y = \ln X$ is normally distributed with mean $\mu$ and variance
$\sigma^2$, both unknown. This kind of assumption is often made
in practice. It implies that the probability that a randomly
selected, future, down or repair time exceeds h is given by the
formula

$$P(X > h) = \int_{(\ln h - \mu)/\sigma}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz = \int_{q}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz, \qquad q = \frac{\ln h - \mu}{\sigma}. \qquad (1.1)$$

In practice, this formula is not immediately applicable when
$\mu$ and $\sigma^2$ are unknown, but if we estimate

$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n} \ln x_i = \overline{\ln x} \qquad (1.2)$$

$$\hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (\ln x_i - \overline{\ln x})^2 \qquad (1.3)$$

we can go ahead and quote a point estimate; the latter depends on

$$\hat q = \frac{\ln h - \hat\mu}{\hat\sigma}. \qquad (1.4)$$

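If we examine this quantity (integration limit) it will be
seen to be a single realization of a random variable written as

$$\hat q = \frac{H - \bar Y}{S_Y}, \qquad H = \ln h, \qquad (1.5)$$

where $\bar Y$ is $N(\mu, \sigma^2/n)$ and $(n-1)S_Y^2/\sigma^2$
is a Chi-squared r.v. with n-1 degrees of freedom, the latter
being independent of $\bar Y$ by the convenient (log) normal
assumption. Now re-write $\hat q$ as

$$\hat q = -\frac{1}{\sqrt{n}} \cdot \frac{(\bar Y - H)\sqrt{n}}{S_Y}. \qquad (1.6)$$

If $(H - \mu) = 0$ then $-\hat q\sqrt{n}$ would be precisely
distributed as a Student's t. On the other hand

$$-\hat q\sqrt{n} = \frac{(\bar Y - \mu + \mu - H)\sqrt{n}}{S_Y}.$$

If we write $\delta = \mu - H$ then

$$-\hat q\sqrt{n} = \frac{(\bar Y - \mu + \delta)\sqrt{n}}{S_Y}$$

has a known density function, that of the Non-central t with
n-1 degrees of freedom, which is conveniently expressed in terms
of the non-centrality parameter

$$\frac{\delta\sqrt{n}}{\sigma} = \frac{(\mu - \ln h)\sqrt{n}}{\sigma}.$$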
Classical methods exist for utilizing this to establish
tolerance limits. In this paper a different approach is
followed. We examine the performance of several convenient
approximate methods for assessing the uncertainty in the
simple point estimate (1.1), where estimates (1.2) and (1.3)
are used for the parameter values. These methods are the Delta
method (linearization), the Jackknife, and the Bootstrap, as
well as a completely distribution-free (Bernoulli trials)
method. Details now follow.


B. PURPOSE AND APPROACH

1. Distribution-Free Approach

In general, suppose we want to solve the problem of estimating
the survival probability without any distributional assumption,
other than that the observations are iid. The simplest way is to
use the binomial approach. If $(x_1, x_2, \ldots, x_n)$ indicates
the iid sample of down or repair times, we can estimate P(X>h),
the survival probability, by means of

$$\hat p = \hat P[X > h] = \frac{1}{n}\sum_{i=1}^{n} I(x_i > h), \qquad (1.9)$$

where $I(x_i > h)$ is 1 if $x_i > h$ and 0 otherwise.

Then we can set up a confidence interval for the survival
probability (1.9) by making use of the fact that for large n the
binomial distribution can be approximated by a normal
distribution. An approximate $(1-\alpha)\cdot 100\%$ confidence
interval for the binomial parameter p is given by

$$\hat p - z_{\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}} \le p \le \hat p + z_{\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}} \qquad (1.10)$$

where $z_{\alpha/2}$ is the $(1-\alpha/2)\cdot 100\%$ point of the
tabled unit normal.
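
The binomial interval (1.10) can be sketched as follows. The function name is hypothetical, and clipping the limits to [0, 1] is an added convenience that is not part of (1.10).

```python
import math
from statistics import NormalDist


def binomial_survival_ci(times, h, alpha=0.05):
    """Distribution-free estimate of P(X > h) with a normal-approximation CI."""
    n = len(times)
    p_hat = sum(1 for x in times if x > h) / n        # fraction exceeding h
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)       # upper alpha/2 point
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    # Clipping is an assumption added here so the limits stay valid probabilities.
    return p_hat, max(0.0, p_hat - half), min(1.0, p_hat + half)


# Hypothetical sample: 10 of 40 down times exceed h = 4 hours.
print(binomial_survival_ci([1.0] * 30 + [5.0] * 10, 4.0))
```

When no observation (or every observation) exceeds h, the estimated standard error is zero and the interval collapses to a point, which is one reason the method needs a reasonably large n.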


2. Maximum Likelihood Approach

We can assume, along with others, that repair time data come as
a random sample from a log-normal population: $Y = \ln X$, where
Y is Normal$(\mu, \sigma^2)$. This assumption will be crucial in
all three methods. Then the maximum likelihood estimates
(M.L.E.) of the parameters are as stated before:

$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar y \qquad (1.11)$$

$$\hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar y)^2 = s_y^2 \qquad (1.12)$$

Strictly speaking, $\hat\sigma^2$ is the M.L.E. multiplied by
n/(n-1) and is unbiased for $\sigma^2$. Furthermore, in repeated
samples of size n we have that "exactly" (assuming the model is
correct):

(i) $\hat\mu = \bar y \sim$ Normal$(\mu, \sigma^2/n)$   (1.13)

(ii) $\hat\sigma^2 = s_y^2 \sim \sigma^2\chi^2_{n-1}/(n-1)$   (1.14)
where $E[\hat\sigma^2] = \sigma^2$ and
$\mathrm{Var}[\hat\sigma^2] = 2\sigma^4/(n-1)$

(iii) $\hat\mu$ and $\hat\sigma^2$ are statistically independent.








Thus for large n both $\hat\mu$ and $\hat\sigma^2$ tend to be
close to their respective population values, guaranteeing a good
approximation to the survival probability if the model is
correct. Now according to the assumed model, the probability of
exceeding h hours is

$$P(X > h) = \int_{(\ln h - \mu)/\sigma}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz. \qquad (1.15)$$

The maximum likelihood estimate of this probability is obtained
by replacing $\mu$ by $\hat\mu$ and $\sigma^2$ by $\hat\sigma^2$:

$$\hat P(X > h) = \int_{(\ln h - \hat\mu)/\hat\sigma}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz. \qquad (1.16)$$

Now find upper and lower limits for the parameter

$$q = \frac{\ln h - \mu}{\sigma}, \qquad (1.17)$$

i.e. $\bar q$ and $\underline q$ are functions of the observations
such that $\underline q \le q \le \bar q$ with prescribed
probability $(1-\alpha)\cdot 100\%$, say 95%. These then translate
into upper and lower limits on the probability of exceeding h:

$$\int_{\bar q}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz \le P(X > h) \le \int_{\underline q}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz. \qquad (1.18)$$

If we compute $\underline q$ and $\bar q$ from a sample, then,
under the initial assumptions, we have the desired confidence
limits on the probability of exceeding h.

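3. Delta Method (DL)

The delta method is an approximate way of finding the
distribution of $\hat q$. It is known that functions such as
$\hat q$ are approximately normally distributed for
"sufficiently large" n (see Cramér [Ref. 2]). We estimate q by

$$\hat q = \frac{\ln h - \hat\mu}{\hat\sigma} \qquad (1.19)$$

and use the "delta method", or method of linearization, or small
errors, to estimate the variance:

$$\mathrm{Var}[\hat q] \approx \left(\frac{\partial q}{\partial\mu}\right)^2 \mathrm{Var}[\hat\mu] + \left(\frac{\partial q}{\partial\sigma^2}\right)^2 \mathrm{Var}[\hat\sigma^2] \qquad (1.20)$$

There is no covariance term because of the (theoretical)
independence of $\hat\mu$ and $\hat\sigma^2$; see (iii) above in
section 2. This formula yields

$$\frac{\partial q}{\partial\mu} = -\frac{1}{\sigma}, \qquad \frac{\partial q}{\partial\sigma^2} = -\frac{\ln h - \mu}{2(\sigma^2)^{3/2}}, \qquad \mathrm{Var}[\hat\mu] = \frac{\sigma^2}{n}, \qquad \mathrm{Var}[\hat\sigma^2] = \frac{2\sigma^4}{n-1} \qquad (1.21)$$

so

$$\mathrm{Var}[\hat q] \approx \frac{1}{\sigma^2}\cdot\frac{\sigma^2}{n} + \frac{(\ln h - \mu)^2}{4\sigma^6}\cdot\frac{2\sigma^4}{n-1} \qquad (1.22)$$

$$\mathrm{Var}[\hat q] \approx \frac{1}{n} + \frac{1}{2}\cdot\frac{(\ln h - \mu)^2}{(n-1)\sigma^2} = \frac{1}{n} + \frac{q^2}{2(n-1)}. \qquad (1.23)$$

Assume $\hat q$ can be taken to be normal with mean q and
variance $s_{\hat q}^2$, obtained by substituting $\hat q$ for q
in (1.23), and quote these approximate confidence limits:

$$\bar q_{DL} = \hat q + z_{\alpha/2}\, s_{\hat q} \qquad (1.24)$$

$$\underline q_{DL} = \hat q - z_{\alpha/2}\, s_{\hat q} \qquad (1.25)$$

This translates into the desired (but approximate) confidence
limits for the probability of exceeding h:

$$\int_{\bar q_{DL}}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz \le P(X > h) \le \int_{\underline q_{DL}}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz. \qquad (1.26)$$

Several approximations have been made in the process described,
and the validity, for moderate n, of such a relatively simple
process must be checked. Notice that the exact distribution of
$\hat q$ is non-central t under the basic model assumption. This
approach replaces the non-central t by a convenient normal
approximation.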
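
A minimal sketch of the delta-method limits (1.24) to (1.26) is given below. It assumes the variance of $\hat q$ is estimated by $1/n + \hat q^2/(2(n-1))$, i.e. (1.23) with $\hat q$ substituted for q; the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def delta_survival_ci(times, h, alpha=0.05):
    """Delta-method (DL) limits for P(X > h) under the log-normal model."""
    logs = [math.log(x) for x in times]
    n = len(logs)
    q_hat = (math.log(h) - mean(logs)) / stdev(logs)
    # Assumed variance estimate: (1.23) evaluated at q_hat.
    s_q = math.sqrt(1.0 / n + q_hat ** 2 / (2.0 * (n - 1)))
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)
    q_lo, q_hi = q_hat - z * s_q, q_hat + z * s_q
    # Survival probability decreases in q, so the limits swap roles.
    return 1.0 - nd.cdf(q_hat), 1.0 - nd.cdf(q_hi), 1.0 - nd.cdf(q_lo)


# Hypothetical down times in hours (illustrative values only).
print(delta_survival_ci([0.5, 1.2, 2.0, 3.5, 6.0, 8.5, 1.1, 4.2], 4.0))
```

The returned triple is (point estimate, lower limit, upper limit) for the survival probability.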

4. Jackknife Method (JK)

The jackknife is an alternative way of putting confidence
limits on the parameter

$$q = \frac{\ln h - \mu}{\sigma}.$$

For further discussion see Mosteller and Tukey [Ref. 3] and
Efron [Ref. 4]. In brief, the jackknife method has the capacity
to reduce the bias of estimates of such quantities and, more
importantly, to furnish confidence limits that behave in a
satisfactory manner.

Jackknife estimates and confidence limits are constructed by
successively leaving out parts of the available data to
construct pseudovalues. These are then averaged, and the
stability of the average is assessed by use of Student's t or
the Normal in order to obtain confidence limits. The procedure
is given below for our case:


(1) Form the estimate

$$\hat q_n(y_1, y_2, \ldots, y_n) = \frac{\ln h - \bar y}{s_y}. \qquad (1.27)$$

This is the m.l.e. using all the data, just as before.

(2) Form the estimates $\hat q_{n-1}^{(i)}$, i = 1, 2, ..., n;
these are similar to $\hat q_n$ but omit successively each single
observation $y_1, y_2, \ldots, y_n$; at the next stage each
observation is then restored and the following one taken out, as
i runs from 1 to n, and thus there are n values
$\hat q_{n-1}^{(i)}$.

(3) Compute the pseudovalues as follows:

$$q_i^* = n\,\hat q_n - (n-1)\,\hat q_{n-1}^{(i)}, \qquad i = 1, 2, \ldots, n. \qquad (1.28)$$

(4) Compute the mean and variance of the pseudovalues:

$$\bar q^* = \frac{1}{n}\sum_{i=1}^{n} q_i^* \qquad (1.29)$$

$$s^{*2} = \frac{1}{n-1}\sum_{i=1}^{n} (q_i^* - \bar q^*)^2 \qquad (1.30)$$

(5) Approximate (accuracy increasing with n increasing)
$(1-\alpha)\cdot 100\%$ confidence limits for q are given by:

$$\underline q_{JK} = \bar q^* - t_{\alpha/2}(n-1)\,\frac{s^*}{\sqrt{n}} \le q \le \bar q^* + t_{\alpha/2}(n-1)\,\frac{s^*}{\sqrt{n}} = \bar q_{JK} \qquad (1.31)$$

where $t_{\alpha/2}(n-1)$ is the $(1-\alpha/2)\cdot 100$ percent
point of Student's t with n-1 degrees of freedom (standard,
central, distribution). Also we can use $z_{\alpha/2}$ as before
as an option.

(6) This means that, with approximate $(1-\alpha)\cdot 100\%$
confidence, the probability of survival is between the two
confidence limits that follow:

$$\int_{\bar q_{JK}}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz \le P(X > h) \le \int_{\underline q_{JK}}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz. \qquad (1.32)$$

This procedure, based on the m.l.e., has been theoretically
validated for large n. It competes with the delta method, but is
somewhat more difficult to carry out.
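
Steps (1) to (6) can be sketched as follows. To stay within the Python standard library, this illustration uses the normal point $z_{\alpha/2}$, which the text allows as an option in place of Student's t; the function names are hypothetical, and the standard error is assumed to be the pseudovalue standard deviation divided by $\sqrt{n}$.

```python
import math
from statistics import NormalDist, mean, stdev


def q_stat(logs, log_h):
    """q-hat = (ln h - ybar) / s_y for a list of log times."""
    return (log_h - mean(logs)) / stdev(logs)


def jackknife_survival_ci(times, h, alpha=0.05):
    """Jackknife (JK) limits for P(X > h); needs at least 3 observations."""
    logs = [math.log(x) for x in times]
    n, log_h = len(logs), math.log(h)
    q_all = q_stat(logs, log_h)                       # step (1)
    pseudo = []
    for i in range(n):                                # steps (2) and (3)
        leave_one_out = logs[:i] + logs[i + 1:]
        pseudo.append(n * q_all - (n - 1) * q_stat(leave_one_out, log_h))
    q_jk = mean(pseudo)                               # step (4)
    se = stdev(pseudo) / math.sqrt(n)
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)                 # normal option in step (5)
    q_lo, q_hi = q_jk - z * se, q_jk + z * se
    # Step (6): translate limits on q into limits on the survival probability.
    return 1.0 - nd.cdf(q_jk), 1.0 - nd.cdf(q_hi), 1.0 - nd.cdf(q_lo)


# Hypothetical down times in hours (illustrative values only).
print(jackknife_survival_ci([0.5, 1.2, 2.0, 3.5, 6.0, 8.5, 1.1, 4.2], 4.0))
```

For small n the Student's t point is wider than the normal point, so this sketch will give somewhat narrower limits than the t-based version of (1.31).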

5. Bootstrap Method (BT)

The bootstrap method (see Efron (1979) [Ref. 4]) is similar to
the jackknife method, but differs in being a resampling
procedure. The procedure is given below for our case:

(1) Calculate

$$\hat q = \hat q(y_1, y_2, \ldots, y_n) = \frac{\ln h - \bar y}{s_y}. \qquad (1.33)$$

This is the m.l.e. using the original data, same as before.

(2) Draw a "bootstrap sample" $y_1^*, y_2^*, \ldots, y_n^*$, using
$y_1, y_2, \ldots, y_n$ as the basic distribution, each value
having probability 1/n, and calculate

$$\hat q^* = \frac{\ln h - \bar y^*}{s_y^*}. \qquad (1.34)$$

(3) Independently repeat step (2) a large number of times, B,
obtaining "bootstrap replications"
$\hat q_1^*, \hat q_2^*, \ldots, \hat q_B^*$, and calculate their
mean

$$\bar q^* = \frac{1}{B}\sum_{b=1}^{B} \hat q_b^* \qquad (1.35)$$

and their sample variance $s^{*2}$ (1.36).

(4) Quote approximate $(1-\alpha)\cdot 100\%$ confidence limits
for q. Here are four different approaches:
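(a) Non-Parametric Approach (BT1): Take the order statistics of
the bootstrap replications

$$\hat q_{(1)}^* \le \hat q_{(2)}^* \le \cdots \le \hat q_{(B)}^*,$$

then let $j = [B\cdot\alpha/2]$, and take as confidence limits
for q:

$$\hat q_{(j)}^* \le q \le \hat q_{(B-j+1)}^*. \qquad (1.37)$$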
(b) Normal Approximation Approach (BT2): If we assume the
bootstrap sample is approximately normally distributed then
approximate confidence limits for q can be set down:


(1.38)





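(c) Bias-Adjusted Non-Parametric Approach (BT3): The bootstrap
estimate of bias is

$$\mathrm{BIAS} = \bar q^* - \hat q$$

where $\bar q^*$ is the mean of the bootstrap replications and
$\hat q$ is the estimate of q from the original data. Confidence
limits for this case, with j and the order statistics as in (a):

$$\hat q_{(j)}^* - \mathrm{BIAS} \le q \le \hat q_{(B-j+1)}^* - \mathrm{BIAS}. \qquad (1.39)$$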


(d) Bias-Adjusted Normal Approximation Approach (BT4): In this
approach the confidence limits are:

$$(\hat q - \mathrm{BIAS}) - z_{\alpha/2}\, s^* \le q \le (\hat q - \mathrm{BIAS}) + z_{\alpha/2}\, s^* \qquad (1.40)$$

where $s^*$ is the standard deviation of the bootstrap
replications.

Some simulation results for these four approaches will be
presented in the analysis section. These cases are identified as
BT1, BT2, BT3, BT4.
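
The four bootstrap variants can be sketched together as below. Several details are assumptions made for illustration and are not confirmed by the legible text: the upper percentile index is taken as B-j+1, the BT2 interval is centered on $\hat q$ (equation (1.38) is not legible), degenerate resamples with zero spread are redrawn, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def bootstrap_q_limits(times, h, B=200, alpha=0.05, seed=0):
    """Limits on q = (ln h - mu) / sigma for the variants BT1 to BT4."""
    rng = random.Random(seed)
    logs = [math.log(x) for x in times]
    n, log_h = len(logs), math.log(h)
    q_hat = (log_h - mean(logs)) / stdev(logs)
    reps = []
    while len(reps) < B:
        star = [rng.choice(logs) for _ in range(n)]   # resample with prob. 1/n each
        s = stdev(star)
        if s > 0.0:                                   # assumption: redraw degenerate samples
            reps.append((log_h - mean(star)) / s)
    reps.sort()
    s_star = stdev(reps)
    bias = mean(reps) - q_hat
    j = max(1, int(B * alpha / 2.0))
    lo_pct, hi_pct = reps[j - 1], reps[B - j]         # j-th and (B-j+1)-th order statistics
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return {
        "BT1": (lo_pct, hi_pct),
        "BT2": (q_hat - z * s_star, q_hat + z * s_star),   # assumed centering
        "BT3": (lo_pct - bias, hi_pct - bias),
        "BT4": (q_hat - bias - z * s_star, q_hat - bias + z * s_star),
    }


def to_survival(q_lo, q_hi):
    """Convert limits on q into (lower, upper) limits on P(X > h)."""
    nd = NormalDist()
    return 1.0 - nd.cdf(q_hi), 1.0 - nd.cdf(q_lo)


# Hypothetical down times in hours (illustrative values only).
limits = bootstrap_q_limits([0.5, 1.2, 2.0, 3.5, 6.0, 8.5, 1.1, 4.2], 4.0)
for name, (q_lo, q_hi) in limits.items():
    print(name, to_survival(q_lo, q_hi))
```

By construction BT3 has the same width as BT1 and BT4 the same width as BT2; the bias adjustment only shifts the interval.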
II. SIMULATION PROCEDURE


A simulation procedure has been used to compare the three
methods for obtaining confidence intervals for the probability
that a repair time exceeds h = 4 hours, both in the case in which
the log-normal assumption is met and in other cases in which it
is not (the exponential and long-tailed exponential).
Specifically, simulation has been used to compute

(a) The actual coverage of the true survival probability by the
confidence intervals given by the procedures under study, when
the nominal coverage is $(1-\alpha)\cdot 100\%$.

(b) Measures of confidence interval size: the expected width and
the standard deviation of the width.

The simulation programs were written in FORTRAN IV, and the
simulations were carried out on the IBM 3033 at the Naval
Postgraduate School. The Naval Postgraduate School LLRANDOM
package was used, along with the International Mathematical and
Statistical Library (IMSL) random generator; 1000 replications
were used to evaluate each procedure in each distributional
situation. Also, B = 200 bootstrap replications were taken for
each trial. Four sample sizes (n = 10, 20, 30, 40), h = 4 hours,
and three distributions (Log-normal, Exponential, and a
Long-tailed Exponential) were investigated.
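
The coverage and width measures (a) and (b) can be illustrated with a small Python analogue of the simulation loop. This is not the FORTRAN program of the thesis: it uses Python's own random generator, checks only the delta-method interval (with variance estimate $1/n + \hat q^2/(2(n-1))$), and all names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def delta_q_limits(logs, log_h, alpha):
    """Delta-method limits on q from a list of log times."""
    n = len(logs)
    q_hat = (log_h - mean(logs)) / stdev(logs)
    s_q = math.sqrt(1.0 / n + q_hat ** 2 / (2.0 * (n - 1)))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return q_hat - z * s_q, q_hat + z * s_q


def coverage_lognormal(mu, sigma, n, h=4.0, reps=1000, alpha=0.05, seed=1):
    """Return (coverage, mean width, std. dev. of width) for log-normal data."""
    rng = random.Random(seed)
    nd = NormalDist()
    true_p = 1.0 - nd.cdf((math.log(h) - mu) / sigma)
    hits, widths = 0, []
    for _ in range(reps):
        logs = [rng.gauss(mu, sigma) for _ in range(n)]   # ln X is normal
        q_lo, q_hi = delta_q_limits(logs, math.log(h), alpha)
        p_lo, p_hi = 1.0 - nd.cdf(q_hi), 1.0 - nd.cdf(q_lo)
        hits += p_lo <= true_p <= p_hi                    # measure (a)
        widths.append(p_hi - p_lo)                        # measure (b)
    return hits / reps, mean(widths), stdev(widths)


print(coverage_lognormal(1.0, 1.0, 20))
```

The same loop applies to any of the interval procedures: only the function that produces the limits on q changes.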

An outline of the simulation procedure now follows:

(A) Log-normal: In this case the basic variables, the down or
repair times, are i.i.d. log-normal (see Appendix B). This case
was simulated for the following "population" parameter values:

(1) $\mu = 1$, $\sigma^2 = 1$
(2) $\mu = 1$, $\sigma^2 = \ln 2$
(3) $\mu = 1$, $\sigma^2 = 3\ln 2$
(4) $\mu = 1$, $\sigma^2 = (\ln 2)/3$

The procedures of the previous section were used to obtain
confidence intervals for the probability that a repair time
exceeds h = 4 hours.

(B) Stretched long-tailed exponential: Down or repair times come
independently from a stretched long-tailed exponential; see
Appendix B. The simulated data were treated as if they were a
sample from a log-normal distribution and the procedures of the
previous section were carried out. In this simulation we used the
log transformation to tend to convert the long-tailed exponential
observations towards normality (symmetrize them). The stretched
long-tailed exponential model is:

$$X = A\cdot Z(1 + C\cdot Z) \qquad (2.1)$$

X is stretched long-tailed exponential where Z has the Exp(1)
distribution. Simulation was carried out for A = 3.225,
C = 0.1948. These values were taken in order to compare the
results with the Exponential ($\lambda$ = 0.22) case.
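
A short sketch of how such observations can be generated, and how the true survival probability follows from the model, is given below. The function names are hypothetical; the survival formula inverts $w = A z (1 + C z)$ for z and uses $P(Z > z) = e^{-z}$ for the unit exponential.

```python
import math
import random

A, C = 3.225, 0.1948   # values quoted in the text


def stretched_sample(n, a=A, c=C, rng=random):
    """Draw n values of X = a * Z * (1 + c * Z), with Z unit exponential."""
    out = []
    for _ in range(n):
        z = rng.expovariate(1.0)
        out.append(a * z * (1.0 + c * z))
    return out


def stretched_survival(h, a=A, c=C):
    """True P(X > h): solve h = a*z*(1 + c*z) for z >= 0, then exp(-z)."""
    z_h = (-a + math.sqrt(a * a + 4.0 * a * c * h)) / (2.0 * a * c)
    return math.exp(-z_h)


print(stretched_survival(4.0))
print(stretched_sample(5, rng=random.Random(3)))
```

The value returned by `stretched_survival(4.0)` is the target that the simulated confidence intervals should cover in case (B).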

(C) Exponential down or repair time: In this situation
simulations were carried out for two cases. First, taking the log
of exponential down or repair times, and treating these as having
the normal distribution; that is, treating the data as
log-normally distributed. Second, taking the p power of the data
values, and then treating the transformed values as normally
distributed. Appendix C gives an algorithm for estimating the p
value from the data. Simulations were also carried out for
p = 0.33 (the classical Wilson-Hilferty value), and for three
$\lambda$ values.
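
The second case can be sketched as follows, assuming that the threshold h is transformed with the same power as the data, so that $P(X > h) = P(X^p > h^p)$ for p > 0. The function name and the sample are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def power_transform_survival(times, h, p=1.0 / 3.0):
    """Estimate P(X > h) by treating X**p as normally distributed."""
    t = [x ** p for x in times]
    # Same form as q-hat in the log-normal case, with x**p in place of ln x.
    q_hat = (h ** p - mean(t)) / stdev(t)
    return 1.0 - NormalDist().cdf(q_hat)


# Hypothetical down times in hours (illustrative values only).
print(power_transform_survival([0.5, 1.2, 2.0, 3.5, 6.0, 8.5], 4.0))
```

Confidence limits then follow exactly as in the log-normal case, with the transformed values replacing the logs.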







III. ANALYSIS


The methods for obtaining 95% confidence intervals for the
probability that a repair time exceeds h = 4 hours described in
chapter 1 were performed on simulated data having various
distributions; these distributions were described in chapter 2.
Simulation results for each method are shown in Tables 1 to 11.

If we examine these tables case by case, we can find these
results:

(A) Log-normal data: All three methods work very well for this
case except BT1 and BT3; they seem to consistently have less than
nominal coverage. The simple delta method exhibits good coverage,
and always has relatively small average width, and also low
standard deviation. It seems to work as well as JK and BT4 for
small sample sizes (n = 10, 20). In large sample sizes all
methods agree in their coverage except for BT1. The method BT4
always appears to exhibit over-coverage.

(B) Stretched long-tailed exponential data: Table 5 shows that
JK and BT4 exhibit over-coverage when the sample size n = 10. JK,
DL, BT4 appear to exhibit correct coverage for n = 20, 30, 40;
the others do not, and at sample size n = 40 they are very poor.
Also there is decreased coverage for all methods when the sample
size increases; the JK and BT4 methods have lower average width
than does DL when the sample size increases. This is a result of
the bias.

(C) Exponential case: Tables 6, 7, 8 show that the log
transformation may give very poor results for the exponential
case, especially in Table 7. If we examine Table 6 and Table 8,
these tables show that DL, JK, BT4 work well for small sample
sizes. Tables 9, 10 and 11 indicate that the power transformation
works better than the log transformation. The JK, BT2, and BT4
methods always have better coverage than the DL method. Also all
methods agree in their coverage when the sample size increases,
as was true for the log-normal case.

Generally JK and BT4 exhibit acceptable coverage.

Table 1: Simulation results for log-Normal (μ=1., σ=1.) case.
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10 
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0.96 20 


0.9450 
Simulation results for


h=4.0 
Table 3: Simulation results for log-Normal (μ=1., σ=.48) case.

Table 4: Simulation results for log-Normal (μ=1., σ=1.44)

Table 5: Simulation results for Stretched Long-tailed
Exponential.


h=4.0 A=3.225 C=0.1948

Sample                      Average   Std. Dev.
Size    Method   Coverage   Width     Width
10 28 Of '5.30 0.4341 0250 
JK Gr 7 20 4973 9.1094 

Br Orso 50 0.4404 O10 52 
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3T4 0.942) 0.2445 0.0309 

4d DL G2 943) Oe 230 1 0.0087 
Jk C .916) Were 115 0.0245 
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BT2 0.8940 022063 ).Q2u4 

5T3 Tee sys) oye. O220e5 0.0258 

BT4 0.924) Dies hore t 0.0241 







Table 6: Simulation results for EXP(λ) case using Log
transformation.


P(X>4.0)=0.4148


Sample                      Average   Std. Dev.   Point
Size    Method   Coverage   Width     Width       Estimate
10 DL 0.9480 90.4432 Oley SASS Oe 30 31 
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39 DL 0.9380 052607 2.0061 OE Sela) 4 
Table 7: Simulation results for EXP(λ) case using Log
transformation.


h=4.0 λ=0.13


Sample
Size    Method   Coverage
Table 8: Simulation results for EXP(λ) case using Log
transformation.


h=4.0 λ=0.26 P(X>4.0)=0.3535

Sample                      Average   Std. Dev.   Point
Size    Method   Coverage   Width     Width       Estimate
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Baz 0. 9260 0.4478 0. 1062 Oe Oy 
Table 9: Simulation results for EXP(λ) case using $x^p$
transformation.
h=4.0 λ=0.22 p=0.33

Table 10:

Table 11:
IV. EXAMPLE: APPLICATION TO OPERATIONAL DATA


In this chapter four methods were applied to a real data set.
The methods are Binomial (BN), Delta (DL), Jackknife (JK), and
Bootstrap (see section I-B); specifically, the data refer to
recovery times from loss of offsite power at nuclear plants. The
problem was to estimate survival probabilities for several values
of h, including h = 2.5, 3.0, 3.5, 4.0 (hours). Data points
(n = 42) are shown in Appendix E. We initially applied several
statistical goodness-of-fit tests to inquire into the evidence
for the adequacy of the Log-Normal distribution as a model for
these data. The results of these goodness-of-fit tests are as
follows:

(1) Chi-square test: See Arnold, D. [Ref. 5]; this accepts the
log-normal model at the significance level $\alpha$ = 0.05.

(2)  Kolmogorov-Smirnov test: See Arnold, D. [Ref. 5]; 
this test rejects the log-normal model with the tabu- 
wee Ge— 0510567 and test statistic DP = 0721 for 


(3) Wilk-Shapiro test: See Hahn, G. J. [Ref. 6]; this test
accepts the log-normal model for $\alpha$ = 0.05.


We applied four estimation methods to these data, utilizing the
log-normal assumption. The results are shown in Table 12.
Table 12: Recovery Time Example Results
V. CONCLUSIONS


The Delta, Jackknife and Bootstrap methods applied to the
log-normal model work well when down or repair times are truly
log-normal. Especially notice that DL, JK, and BT4 seem to work
much better than BT1, BT2, BT3. Recall that these procedures do
not appear sensitive to the population variance; see Section III.
It is comparatively easy to use JK and BT4 when the sample size
is small (n = 10, 20). The delta method is always convenient, but
especially when the sample is large (n = 40 or more) because it
is a very simple procedure to apply, requiring much less
computation than the others. As Table 12 shows, the Binomial
method gives some idea of the survival probability for practical
purposes.

Note that the Binomial confidence limits are much wider than
those that assume the log-normal model.

Use of the log transform on exponential data produces biased
estimates of survival. Use of the power transformation with
(p = 1/3) always gives a better coverage of the survival
probability when the data are exponential. One procedure was
described in Appendix C for estimating the p value from data.
Table 13 gives simulation results for the exponential case. As
our results show, this procedure is not estimating the p value
correctly. If this procedure were to work correctly (if it could
be calibrated) then we could use the $x^p$ transformation (for
converting data towards the normal form) without making any
assumption (e.g. that the data come from an exponential or gamma
or log-normal distribution, etc.). Then after this is done,
methods DL, JK, BT4 might produce considerably better confidence
limits for the actual survival probability.
APPENDIX A


COMPUTER PROGRAMS


The simulation programs consist of two main programs for the
three methods (DL, JK, BT). These main programs compute survival
probability confidence limits, and score the coverage for each
replication. Then, after 1000 replications, the program computes
the statistics of these parameters and prints out the results.

There is another program, called SURVP. This program computes
point estimates, confidence limits, and widths of confidence
limits on survival probability, using the BN, DL, JK, BT
procedures on a given data set, under the log-normal model
assumption.

Variables List:

R     = Down or repair times.
R1    = Log of down or repair times.
RBAR  = Mean of down or repair times.
RSD   = Standard deviation of down or repair times.
GHAT  = Point estimate of the q parameter (see 1.17) for the
        delta method.
GJK   = Point estimate of the q parameter for the jackknife
        method.
GBOOT = Point estimate of the q parameter for the bootstrap
        method.
VARG  = Variance of the point estimate for the delta method.
SE    = Standard error of the point estimate for the jackknife
        method.
PHAT  = Point estimate of the survival probability.
BUP   = Upper confidence limit estimate of the q parameter.
BLOW  = Lower confidence limit estimate of the q parameter.
CUP   = Upper confidence limit estimate of the survival
        probability.
CLOW  = Lower confidence limit estimate of the survival
        probability.
AINT  = Width of the estimated confidence limits of the survival
        probability.
G     = Pseudovalues.
F     = Bootstrap replications.
N     = Number of data points.
N1    = Number of replications.
N2    = Number of bootstrap replications.
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RESAMPLING 
DO 32 JJ=1,N2 
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BOOTSTRAP REPLICATIONS 


~~ 
CALL GGUD(DSEED,K,NR,IR)
(Jl) 

ONTI 

ALL 


DO 33 Jil 
BOOTSTRAP SAMPLE 


(GH»yN2,TOPTsSTAT, IER) 


—GHAT 
GH,N2) 


ALOG(HI-STATOLIIVZCSTAT (5) **.5) 
COMPUTE CONFIDENCE LIMITS 
APPENDIX B


LOG-NORMAL AND STRETCHED LONG-TAILED
EXPONENTIAL DISTRIBUTION


(1) Let x be a log-normal random variable, for which ln(x) is
$N(\mu, \sigma^2)$. The k-th moment of x is as follows:

$$E[x^k] = \exp(k\mu + \tfrac{1}{2}k^2\sigma^2)$$

see [Ref. 7], so

$$E[x] = \exp(\mu + \tfrac{1}{2}\sigma^2)$$
$$E[x^2] = \exp(2\mu + 2\sigma^2)$$
$$\mathrm{Var}[x] = (E[x])^2 (e^{\sigma^2} - 1)$$

The squared "coefficient of variation" is as follows:

$$\frac{\mathrm{Var}[x]}{(E[x])^2} = e^{\sigma^2} - 1$$

(2) Let W be a stretched long-tailed exponential variable, with
$W = A\cdot z(1 + C\cdot z)$, where z is unit exponential and A
and C are constants. We write the CDF for this distribution as:

$$P(W \le w) = P[A\cdot z(1 + C\cdot z) \le w] = P[Z \le z(w)]$$

where $w = A\cdot z(1 + C\cdot z)$. If we solve this equation
for z we get z(w) as

$$z(w) = \frac{-A + \sqrt{A^2 + 4ACw}}{2AC}$$

$$E[W] = A(1 + 2C)$$
$$E[W^2] = A^2(2 + 12C + 24C^2)$$
$$\mathrm{Var}[W] = A^2(1 + 8C + 20C^2)$$

$$\frac{\mathrm{Var}[W]}{(E[W])^2} = \frac{1 + 8C + 20C^2}{1 + 4C + 4C^2}$$

If we look at W as a log-normal $(\mu, \sigma^2)$ variable then

$$e^{\sigma^2} - 1 = \frac{1 + 8C + 20C^2}{1 + 4C + 4C^2}$$

We can get the C value from this equation; then

$$E[W] = A(1 + 2C) = \exp(\mu + \tfrac{1}{2}\sigma^2)$$

gives the A value.
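
The moment matching above reduces to a quadratic in C. The sketch below solves it and recovers A; the function name is hypothetical, and the restriction on the squared coefficient of variation (between 1 and 5) follows from the ratio in the last equation, which runs from 1 at C = 0 toward 5 as C grows.

```python
import math


def match_stretched_to_lognormal(mu, sigma2):
    """Find (A, C) so that W = A*z*(1 + C*z) matches the log-normal
    mean and squared coefficient of variation."""
    k = math.exp(sigma2) - 1.0            # squared CV of the log-normal
    if not 1.0 <= k < 5.0:
        raise ValueError("squared CV must lie in [1, 5) for a solution with C >= 0")
    # k*(1 + 4C + 4C^2) = 1 + 8C + 20C^2
    #   ->  (20 - 4k) C^2 + (8 - 4k) C + (1 - k) = 0
    a2, a1, a0 = 20.0 - 4.0 * k, 8.0 - 4.0 * k, 1.0 - k
    c = (-a1 + math.sqrt(a1 * a1 - 4.0 * a2 * a0)) / (2.0 * a2)
    a = math.exp(mu + sigma2 / 2.0) / (1.0 + 2.0 * c)
    return a, c


print(match_stretched_to_lognormal(1.0, 1.0))
```

For $\mu = 1$, $\sigma^2 = 1$ this gives values close to the A = 3.225 and C = 0.1948 used in the simulations.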
APPENDIX C


POWER TRANSFORMATION


The problem in the $x^p$ transformation (toward the normal form)
is finding the p value for given data. One method for finding the
p value is as follows [Ref. 8]:

Let $x_1, x_2, \ldots, x_n$ be the data points and M the median
of these data.

(1) Take the order statistics of the given data

$$x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$$

(2) Then compute $q_i$ values as
x ; Mi Viet (ie ae. <a 
( Pee sae) eee) 
where M is the median of the data.

(3) Take the median or mean of the $q_i$:
q = median($q_i$) or q = mean($q_i$).

(4) Then we can get the p value as p = 1 - q.


Table 13 gives simulation results for this algorithm for the
exponential case. For the exponential case the best p value is
p = 1/3. The simulation results do not give this value, so this
algorithm is not working correctly.
Table 13: Simulation results for the p algorithm using
Exponential data.


λ=0.22

Sample   Average   Variance   Skewness
Size     of P      of P       of P
19 0.74 3u 0.9174 Om 75.35 
20 Ueno 39 Oi Oy ars o 
30 Un 3136 Oa2t55 Vesa 
4Q Ga5il2.) Olisat 2 1eo2o8 
59 9.4903 Ort 97 1.4981 
60 0.4859 Que laid levees 
79 OZ O56 Cuore 1.5475 
80 0.4571 0.0814 126689 
20 0.4498 Or0 710 es. Or 
100 0.4470 WEP Sey sis, 1.3234 
110 0.4403 Ge So ago to 
120 0.4323 20515 1.4446 
130 9.4298 0.04 34 He 3 15U 
140 Ce 4250 020452 1.4466 
150 9.4196 0503 94 leo G4 
APPENDIX D


MEAN SQUARE ERROR OF SURVIVAL PROBABILITY


Mean square errors were calculated for the exponential case
using the log transformation for three $\lambda$ values and
h = 4.0. The procedure is as follows:

(1) Generate an exponential sample $x_1, x_2, \ldots, x_n$,
n = 10, 20, ..., 250.

(2) Find the actual survival probability as:

$$P(X > h) = e^{-\lambda h}$$

(3) Estimate the survival probability (incorrectly) as:

$$\hat P(X > h) = \int_{(\ln h - \hat\mu)/\hat\sigma}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz$$

(4) Calculate $(P(h) - \hat P(h))^2$.

(5) Repeat this procedure 1000 times.

(6) Calculate the MSE as:

$$\mathrm{MSE} = \frac{1}{1000}\sum_{i=1}^{1000} (P(h) - \hat P_i(h))^2$$

Simulation results for the mean square error of the survival
probability are shown in Tables 14, 15 and 16.
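
Steps (1) to (6) can be sketched in Python as follows. This is an illustrative analogue, not the thesis program: it uses Python's random generator, the function name is hypothetical, and zero draws are redrawn so that the logarithm is always defined.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mse_log_transform(lam, n, h=4.0, reps=1000, seed=2):
    """MSE of the log-normal plug-in estimate when the data are EXP(lam)."""
    rng = random.Random(seed)
    nd = NormalDist()
    true_p = math.exp(-lam * h)                       # step (2)
    total = 0.0
    for _ in range(reps):                             # step (5)
        logs = []
        while len(logs) < n:                          # step (1)
            x = rng.expovariate(lam)
            if x > 0.0:                               # added guard: log(0) is undefined
                logs.append(math.log(x))
        q_hat = (math.log(h) - mean(logs)) / stdev(logs)
        p_hat = 1.0 - nd.cdf(q_hat)                   # step (3)
        total += (true_p - p_hat) ** 2                # step (4)
    return total / reps                               # step (6)


for n in (10, 40):
    mse = mse_log_transform(0.22, n)
    print(n, mse, math.sqrt(mse))
```

Because the log-normal model is wrong for exponential data, the MSE does not go to zero as n grows; it levels off at the squared bias of the plug-in estimate.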







Table 14: Mean Square Error of Survival probability
for EXP(λ) case using Log transformation.


h=4.0 λ=0.22

Sample   Mean Square    Square Root
Size     Error (MSE)    of MSE
10 0.0166 0.1285 
20 0. UCs5 0509729 
30 0.0063 020795 
40 0.0056 0.9746 
59 0.0049 0.0699 
60 0.0045 VU. 06738 
70 0.9041 0.9642 
80 0.0040 020635 
90 v0 CeG 0.0614 
109 0.0028 OFS IO eth 
pe) 0.0037 0.7604 
120 050,035 0.9594 
130 0.0034 O205.85 
140 0.0033 0270573 
1350 020033 O70 575 
160 H200s2 O20 572 
179 0.0032 0.9564 
180 0.0031 0.0560 
190 0003.2 0.9562 
209 0.9032 0.0563 
210 0.0031 O20 | 
220 9.0031 0.0557 
239 o..9939 US OS Sys 
240 0.0031 W055 7 
250 0210031 029553 





Table 15: Mean Square Error of Survival probability
for EXP(λ) case using Log transformation.

h=4.0 λ=0.13

Sample   Mean Square    Square Root
Size     Error (MSE)    of MSE
19 0.0234 O53 
20 0. O42 i, tiles 2 
30 0.0098 0.0991 
49 0.0089 0.9942 
50 0.0081 9.0897 
60 0.0076 050869 
70 0.9070 0.0834 
89 0.0968 9.0824 
90 0.0065 350203 
1090 0.0064 0.0802 
110 0.016 3 0207792 
120 0.0061 Oe 734 
1390 0. 0059 0.0770 
140 OPT a 0.9765 
159 0.0058 0.9762 
Tied 0.9058 Oda 58 
170 Ong! G20953 
189 OE.0 6 55 e074 5 
1990 0. O56 9.0746 
200 0%. 0656 O59 747 
ZA0 0.9056 0.9747 
229 OF0G 55 0.9744 
230 0.0056 Cages 5 
249 O25 5 Gaon 43 
250 9.9054 020737 


a, 





Table 16: Mean Square Error of Survival probability
for EXP(λ) case using Log transformation.


h=4.0 λ=0.25

Sample   Mean Square    Square Root
Size     Error (MSE)    of MSE
10 Qen01.3 5 0.1169 
20 0.0066 Oe Genz 
39 0.0047 0.9688 
40 0.0041 0.0637 
50 0.0034 0.9585 
69 0.0031 Oe S)ey3, 
79 90.0028 0.026 
&0 0.0027 0.0519 
99 0500.25 9.0495 
100 9.0024 0.9492 
110 020023 0.0483 
120 On 01022 0.0472 
130 0.0022 0.0465 
149 0.0021 0.0455 
150 0.0021 0.94523 
eo 0.90290 0.0448 
170 9.0019 0.90439 
139 020019 0.0435 
190 Oe0019 0.0437 
2990 OSC 0.90438 
219 0.0019 Oe 25 
220 020019 00431 
239 020049 ja 43 1 
ENG, 0.0018 0.0429 
250 0.09018 0.0425 
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plants: 


24. 


Oo - 


6160 


6166 


BA. oieil 
. 4660 
.6666 
- 4833 
Slee 0,0) 
.0666 


as Se. 


22 


DATA POINTS FOR EXAMPLE 


These are recovery times 


.6660 
. 2000 
. 8000 
a2o3 3 
hee 
2W02Zs 
.6660 
.0166 


.0041 


APPENDIX E 


Pes 


(hours) 


0830 


shop 3)3 
woo 0 
.0000 
eS.3 
Be Bee: 
7 S00 


~2000 


at 


20033 
.U332 
seis) S10. 
7000 
-0166 
mye ey 
. Logs 


. 9000 


from LOSP at nuclear 


soe 
Oo 0¢ 
iO 
. 1666 
feo 
Peco) oye, 
70166 


TNS: 





LIST OF REFERENCES


1. Mosteller, F., and Tukey, J. W., Data Analysis and
Regression, Addison-Wesley, 1977.

2. Cramér, Harald, Mathematical Methods of Statistics,
Princeton University Press, 1946.

3. Mosteller, F., and Tukey, J. W., Data Analysis, Including
Statistics, Addison-Wesley, 1968-1970.

4. Efron, Bradley, The Jackknife, the Bootstrap, and Other
Resampling Plans, Technical Report No. 63, Stanford University,
Department of Statistics, December 1980.

5. Arnold, O. Allen, Probability, Statistics and Queueing
Theory with Computer Science Applications, 1978.

6. Hahn, G. J., and Shapiro, S. S., Statistical Models in
Engineering, John Wiley and Sons, Inc., New York, 1967.

7. Johnson, N. L., and Kotz, S., Continuous Univariate
Distributions, vol. 1, John Wiley and Sons, Inc., 1970.

8. Emerson, John D., and Stoto, Michael A., Journal of the
American Statistical Association, Vol. 77, Number 377,
March 1982.





INITIAL DISTRIBUTION LIST


No. Copies


Defense Technical Information Center 2 
Cameron Station 
Alexandria, Virginia 22314 


Library, Code 0142 2 
Naval Postgraduate School 
Monterey, California 93940


Department Chairman, Code 55 i: 
Department of Operations Research 

Naval Postgraduate School 

Monterey, California 93940 


Professor D. P. Gaver, Code 55Gv 5 
Department of Operations Research 

Naval Postgraduate School 

Monterey, California 93940 


Bogazici Universitesi 1 
Yon Eylem Analizi Bolumu 
Besiktas, Istanbul, Turkey 


DZ Koka 1 5) 
Bakanliklar, Ankara, Turkey 


Deniz Harpokulu K.ligl il 
Heybeliada, Istanbul, Turkey 


Professor P. A. Jacobs, Code 55Jc 1 
Department of Operations Research 

Naval Postgraduate School 

Monterey, California 93940 


DP Cora Deniz it 


@vJakeorteesi 27, Block D.l 
Yenilevend, Istanbul, Turkey 


Chief of Naval Research
Arlington, VA 22217


Dean of Research 

Code O12A 

Naval Postgraduate School 
Monterey, CA 93940 


Library, Code 55
Naval Postgraduate School 
Monterey, CA 93940 


Thesis
C15454   Cora
         Estimating survival probability or reliability:
         simulation assessments of the delta method,
         jackknife, and bootstrap.





